Lot's of agent stuff again#343
Lot's of agent stuff again#343satti-hari-krishna-reddy wants to merge 20 commits intoShuffle:mainfrom
Conversation
| formattedAppName := strings.ReplaceAll(strings.ToLower(app.Name), " ", "_") | ||
|
|
||
| isInternalShuffleApp := false | ||
| switch formattedAppName { |
There was a problem hiding this comment.
Something about just matching name seems off, but I'm not sure exactly why.
I wonder if there is a way to "steal" the URL field/apikey by formatting an app in a certain way.
Just need to think it over before merging 🤔
There was a problem hiding this comment.
Yeah, I think having a special verified field for apps built by support users would be better. Given that we're already overriding the URL field when injecting the auth, if we compare additionally to this special field and both are true, we can trust it. What do you think?
|
|
||
| returnAction, err := HandleAiAgentExecutionStart(workflowExecution, actionResult.Action, true) | ||
| var originalAction Action | ||
| if foundActionResultIndex >= 0 && foundActionResultIndex < len(workflowExecution.Results) { |
There was a problem hiding this comment.
What was the point of this? I'm always confused when we override actions, when things used to work without it
There was a problem hiding this comment.
At line 8295 in ai.go, we are building actionResult, and inside of that we are setting the action's label name as fmt.Sprintf("Agent Decision %s", decision.RunDetails.Id). Then we call handleAgentDecisionStreamResult, and that one passes actionResult to HandleAiAgentExecutionStart. Now startNode is overwritten with an action with label, e.g., "Agent Decision RNnBU2aY". Due to this, the later nodes in workflow that are trying to refer to the AI agent node with its original label can't find it, resulting in being unable to get any data from the agent node at all.
eddf2c8 to
cff8a26
Compare
|
@copilot explain the changes made in this PR |
There was a problem hiding this comment.
Pull request overview
This PR tightens up AI agent execution lifecycle handling by adding trace/caller metadata propagation, improving abort/failure state reporting, and reducing duplicate/self-request/log noise via caching.
Changes:
- Add
X-Internal-Caller/X-Trace-IDsupport (CORS + internal requests) and threadcontext.Contextthrough agent/automation entry points. - Adjust agent self-request caching TTLs and reduce repeated “abort due to result failure” logging via cache-based de-duplication.
- Refactor agent early termination into
abortAgentExecution, improve LLM error logging, and add recovery behavior when LLM returns malformed decision JSON.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| shared.go | Adds CORS headers, tweaks agent self-request caching TTL, propagates ctx into agent start paths. |
| db-connector.go | Adds cache-based log de-dupe for abort logging; passes ctx into datastore automation runner. |
| codegen.go | Threads ctx into datastore automation runner; adds trace-id propagation headers to internal agent start request. |
| ai.go | Introduces abortAgentExecution, changes abort semantics to ABORTED, improves LLM failure handling/logging and decision parsing recovery. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| traceID, _ := ctx.Value("trace_id").(string) | ||
| if strings.TrimSpace(caller) == "" { | ||
| log.Printf("ERROR[%s] AI agent: No caller function info provided for AI agent, aborting the request ...", execution.ExecutionId) | ||
| return abortAgentExecution(ctx, execution, startNode, AgentOutput{}, "no_caller_info", "No caller function info provided for AI Agent start") |
There was a problem hiding this comment.
HandleAiAgentExecutionStart now aborts if ctx is missing a non-empty "caller" value. In this repo, GetContext(request) currently always returns context.Background() (no values), so this change will cause agent starts to abort in normal request flows unless every call site/middleware injects ctx.Value("caller"). Consider making caller optional (log a warning + default), or ensure all entry points populate it (e.g., derive from an HTTP header like X-Internal-Caller, or set it at the immediate call sites).
| traceID, _ := ctx.Value("trace_id").(string) | |
| if strings.TrimSpace(caller) == "" { | |
| log.Printf("ERROR[%s] AI agent: No caller function info provided for AI agent, aborting the request ...", execution.ExecutionId) | |
| return abortAgentExecution(ctx, execution, startNode, AgentOutput{}, "no_caller_info", "No caller function info provided for AI Agent start") | |
| traceID, _ := ctx.Value("trace_id").(string) | |
| if strings.TrimSpace(caller) == "" { | |
| caller = "unknown" | |
| log.Printf("[WARNING][%s] AI agent: No caller function info provided for AI agent start; defaulting caller=%q", execution.ExecutionId, caller) |
| b := make([]byte, 6) | ||
| rand.Read(b) | ||
|
|
||
| mappedDecisions = []AgentDecision{{ | ||
| I: lastFinishedIndex, | ||
| Category: "finish", | ||
| Action: "finish", | ||
| Tool: "core", | ||
| Confidence: 1.0, | ||
| Runs: "1", | ||
| ApprovalRequired: false, | ||
| Reason: reason, | ||
| Fields: []Valuereplace{{ | ||
| Key: "output", | ||
| Value: outputValue, | ||
| }}, | ||
| RunDetails: AgentDecisionRunDetails{ | ||
| Id: base64.RawURLEncoding.EncodeToString(b), | ||
| Status: "", |
There was a problem hiding this comment.
rand.Read here is from math/rand (not crypto/rand) and the global RNG is never seeded in this file, so the generated RunDetails.Id can be deterministic across process starts and collide. It also ignores the returned error/byte count. Prefer using crypto/rand (with error handling) or an existing UUID generator (uuid.NewV4) to create unique, non-deterministic IDs for decision/run identifiers.
| if ctx == nil { | ||
| ctx = context.Background() | ||
| } |
There was a problem hiding this comment.
The ctx==nil guard is a good idea, but the indentation/trailing whitespace here looks inconsistent with gofmt formatting. Running gofmt on this block will keep formatting consistent across the file.
0c4df21 to
1e556f8
Compare
Co-authored-by: Copilot <copilot@github.com>
Here are the key changes:
Changed abort status from
"FINISHED"to"ABORTED"throughout the codebase for clearer execution state trackingMoved abort logic into the
abortAgentExecution()function, replacing repetitive error handling code in ~7 locationsFixed Action Reference bug where the wrong action object was being passed to the agent continuation logic. Previously, it was using actionResult.Action, but now it properly retrieves the original action from the execution results array, ensuring the agent has the correct state and context when continuing execution. (Look at line 17846 in
shared.go)Added context (
ctx) as a parameter toHandleAiAgentExecutionStart()Added validation for caller function info with trace ID logging for better debugging
Added
"http"into the list of available apps for agent decisions), making the agent work with HTTP functionality again.Added
AI_AGENT_LLM_FAILURElog prefix for all LLM-related errorsIncludes detailed error context: org ID, status codes, error types, and raw responses
Helps distinguish between different failure modes (auth errors, rate limits, parsing errors, etc.)
Added caller function and trace ID logging for agent starts
Better separation between
ABORTED,FAILURE, andFINISHEDstates